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Abstract 



Independent component analysis (ICA) has proven useful for modeling brain and electroen- 
cephalographic (EEC) data. Here, we present a new, generalized method to better capture the 
dynamics of brain signals than previous ICA algorithms. We regard EEC sources as eliciting 
spatio-temporal activity patterns, corresponding to, e.g., trajectories of activation propagating 
across cortex. This leads to a model of convolutive signal superposition, in contrast with the 
commonly used instantaneous mixing model. In the frequency-domain, convolutive mixing is 
equivalent to multiplicative mixing of complex signal sources within distinct spectral bands. 
We decompose the recorded spectral-domain signals into independent components by a com- 
plex infomax ICA algorithm. First results from a visual attention EEC experiment exhibit (1) 
sources of spatio-temporal dynamics in the data, (2) links to subject behavior, (3) sources with 
a limited spectral extent, and (4) a higher degree of independence compared to sources derived 



by standard ICA. 
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1 Introduction 

Independent component analysis (ICA) is effective in analyzing b rain signals and in particular elec- 
troencephalographic (EEG) data (e.g.. iMakeig et all Il99tl 12002^) . and ICA continues to be useful 
for building new models of experimental data. However, ICA algorithms presently applied to brain 
data rely on several idealized assumptions about the underlying processes that may not be fully 
applicable. Although the results so far obtained with ICA are significant and justify its continued 
use, it is nevertheless desirable to advance the ICA methodology by allowing more realistic modeling 
of EEG dynamics. 

One principal limitation imposed on ICA algorithms is the mixing process by which the source 
signals are assumed to be superimposed to form the measured sensor signals. Presently, ICA analysis 
of brain data is carried out assuming a linear and instantaneous mixing process that can be expressed 
mathematically as multiplication by a single mixing matrix. The physics of electromagnetic wave 
propagation support instantaneous summation at the electrode sensors since c apacitive effects w ithin 
the head are generally regarded as negligible at EEG frequencies of interest llLagerlundlll999|) . 

In the standard ICA model the component signal sources are thought of as neural activity oc- 
curring in a perfectly synchronized manner within spatially fixed cortical domains. This assumption 
might be too strong, as it does not take into account the possible spatio-temporal dynamics of the 
underlying neural processes, e.g., propagation of neuronal activity, traveling wave patterns of ac- 
tivity, or synchronization between different brain areas with a non-zero phase lag |Freemanl Il97.4 
lLonez da Silva and Storm van LeeuwenI Il978t lArieli et"all Il99fit IStein et all l2f)0f]l) . One way to 
allow the effective sources to exhibit more complex dynamics is to assume a convolutive mixing 
model. In a convolutive mixing process, a single impulse-like activation of an EEG component may 
elicit a sequence of potential maps with varying spatial topography; such a model may thereby allow 
for patterns of spatial propagation of EEG activity. Separation of convolutively mixed sources into 
independent EEG components is not feasible under the instantaneous mixing assumption, since the 
temporal autocorrelation of the EEG results in statistical dependencies between the time-courses of 
consecutive potential maps. At best, instantaneous ICA may s eparate moving sourc es into separate 
stationary components with overlapping 'frames' of activation l|Makeig et alll2000|) . 




A fundamentally different phenomenon, also neglected by the standard ICA model, is the spectral 
quality of EEG signals. EEG researchers have long been familiar with the fact that EEG activity 
has distinctive characteristics in different frequency bands (conventionally delta , theta. alpha, beta, 
and gamma) which may be a ssociated with different physiological processes (JRgrger (1929), and 



also IMakeig and Inlowl l|l993fl ). It may therefore be more appropriate to allow for the existence 
of different functionally independent sources in different frequency bands by modeling the source 
superposition with a different mixing matrix for each frequency band. 

To overcome both shortcomings, we approach convolutive independent component analysis of 
EEG signals through complex ICA applied to different spectral bands. Convolutive mixing in 
the time-domain is equivalent to multiplicative mixing in the frequency-domain with generally dis- 
tinct complex-valued mixing coefficients in different frequency bands. Therefore, by moving to the 
frequency-domain, both spatio-temporal source dynamics and frequency-specific source processes 
may be modeled. Solutions obtained under the standard ICA model with instantaneous mixing in 
the time-domain form a subset within the solution space of complex frequency-domain ICA, corre- 
sponding to signal superposition with the same real- valued mixing matrix in all frequency bands. 

The method consists of two processing stages (cf. Fig.^l. First, the measured EEG signals are 
decomposed into different spectral bands by short-time Fourier transformation or wavelet transfor- 
mation, yielding a complex-valued spectro-temporal representation for each electrode signal. Then, a 
separate independent component analysis is performed on the complex frequency-domain data within 
each spectral band, producing, for each band, a set of complex independent component activation 
time-courses and corresponding complex scalp maps. We also investigate the case of a constrained 
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Figure 1: Schematic representation of the processing stages of the complex spectral-domain ICA al- 
gorithm. Left ('spec'): the recorded electrode signals are decomposed into different spectral bands. 
Center ('cICA'): Complex ICA decomposition is performed within each spectral band. Right: Iter- 
ation steps performed by complex ICA for estimation of each separating matrix W(/). 



complex ICA algorithm where the independent component (IC) activations remain complex, but the 
IC scalp maps are required to be real- valued. 

Within each spectral band, the proposed algorithms find a number of maximally independent 
components equal to the number of employed data channels. Hence, across frequencies the method 
has the potential of identifying more independent processes than the number of electrodes. But 
since EEG processes may not be narrow-band, but may exhibit dynamics within multiple contiguous 
or disconnected bands, independent components at different frequencies might also originate from 
the same spatial EEG generator sourc es. This could, e.g., be the case for mu- rhythm activity 
(|Niedermever and Lones da Silvai Il999l) which is characterized by concurrent activity near 10 Hz 
and 20 Hz. We present methods for evaluating the similarity of independent components at different 
frequencies and for grouping together those components arising from single physiological processes. 

Convolutive mixing models have been used for blind source separation in other domains. For 
example, in the case of speech signals the physic s of wave propagation in air directly leads to a 
convolutive mixing process (e.g., lAnemullerl . l200l|) . However, speech signals are generally modeled 
as wide-band sources, emitting energy essentially over the entire spectral range of interest. The same 
assumption cannot be made for brain signals, so that corresponding convolutive ICA algorithms 
cannot be applied directly to the problem at hand. On the other hand, narrow-band sources are 
also encountere d in telecommu nications applications, leading to ICA algorithms similar to the one 
presented here l|Torkkolal Il9 98) . However, a strict narrow-band assumption may not be completely 
justified for brain signal sources, as mentioned above. The methods presented in this paper appear 
to be sufficiently flexible to model signals in all the aforementioned scenarios. 

The remainder of the paper is organized as follows: In sections 12 . II and l2~2l we define the spectral 
decomposition and mixing model. We derive a complex variant of the infomax ICA algorithm 
l)Bell and SeinowskiL Il995|) in section 12.31 from the maximum-likelihood principle, and discuss a 
variant constrained to real scalp maps in section 12.3.11 Visualization of complex activations and 
maps is discussed in 12.3.21 Section 12.3.31 defines second and fourth order measures for assessing 
the quality of the separation. In section \2. 41 we present methods for measuring similarities between 
independent components in distinct spectral bands. Section ETSl presents methods for comparing real 
time-domain and complex frequency-domain ICA results. Finally, we apply these methods to data 
from a visual attention task EEG experiment in section [3J 
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2 Methods 

2.1 Spectral decomposition 

Consider measured signals Xi(t), where i = 1,...,M denotes electrodes. Their spectral time- 
frequency representations are computed as 

Xi(TJ)=Y,Xi(? + T)bf(T), (1) 
r 

where / denotes center frequency, and 6/(r) the basis function which extracts the spectral band 
/ from the time-domain signal. The basis function is centered at time T. Hence, data of size 
[channels i x times t] is transformed into data of size [channels i x times T x frequencies /] . 

In this paper, we consider the decomposition by means of the short-time Fourier transformation, 
such that bf(r) is given by 

b f (r) = h(r)e-^^ K , (2) 

h(r) being a windowing function (e.g., a hanning window) with finite support in the interval r = 
—K,...,K—1, and 2K denoting the window length. Correspondingly, the frequency index acquires 
values / = 0, . . . , K. Since the product of time- and frequency-resolution is bounded from below 
by 0.5, the chosen windowing function and window length give limited frequency-domain resolution. 
Hence, variability across frequencies is limited and results should be interpreted accordingly. 

2.2 Mixing model 

For each frequency band / the signals x(T, /) = [x\ (T,f), . . . , xm(T, f)] T are assumed to be gener- 
ated from independent sources s(T, /) = [si(T, /), . . . , sn(T, f)] T by multiplication with a frequency- 
specific mixing matrix A(/), 

x(T,/) = A(/)s(T,/), (3) 

with rank(A(/)) = N. Noise is assumed to be negligible. We restrict the presentation to square- 
mixing, M = N, though our methods are also applicable to the case M > N. The estimates 
u(T, /) of the sources are obtained from the sensor signals by multiplication with frequency-specific 
separating matrices W(/), 

u(T,/) = W(/)x(T,/). (4) 

2.3 Complex ICA 

To derive the complex infomax ICA algorithm, we model the sources Si(T, f) as complex random 
variables with a circular symmetric, non-Gaussian probability density function V s (s). Since the 
phase arg(si(T, /)) depends only on the relative position of the window centers T with respect to 
the time-domain signal Sj(i), the property of circular symmetry of the distribution V s {s) is a direct 
result of the window-centers being chosen independently of the signal. Hence, V s (s) depends only 
on the magnitude |s| of s and can be written as 

V s (s)=g(\s\) (5) 

with the function <?(•) : R — * K being a real- valued function of its real argument. Our investigation 
of the statistics of frequency-domain EEG in section l3~Tl demonstrates the data's positive kurtosis. 
Therefore, we choose V s (s) as a super-Gaussian distribution. The assumed two-dimensional dis- 
tribution V s (s) over the complex plane is illustrated in Fig. |3 Its super-Gaussian nature is best 
seen by plotting the corresponding distribution Pi s i(|s|) of the magnitude \s\ versus the correspond- 
ing distribution (a Rayleigh distribution) for a two-dimensional Gaussian distribution of the same 
variance, as illustrated in Fig. |3| 
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Figure 2: The circular symmetric super-Gaussian probability density function V(s) of the complex 
sources s. 

The separating matrix W(/) is obtained by maximizing the log-likelihood L(W(/)) of the mea- 
sured signals x(T, /) given W(/), which in terms of the source distribution V s is 

L(W(/)) = (logP x (x(T, /)|W(/))) T = logdet(W(/)) + (logP s (W(/) x(T, /))) T , (6) 

where (-) T denotes expectation computed as the sample average over T. We perform maximization 
by complex gradient ascent on the likelihood-surface. The (i, j)-element 5wij(f) of the gradient 
matrix VW(/) is defined as 

^Mra +, ra) 1(w(/,) ' (7) 

where d/d$twij(f) and d/d^swij(f) denote differentiation with respect to the real and imaginary 
parts of matrix element Wij(f) = [W(/)]y, respectively. This results in the gradient 

VW(/) = (I - <v(T, /) u(T, ff) T ) W- H (f), (8) 

however, faster con vergence is achieved by using the complex extension of the natural gradient 
l Amari et allll996(l 



VW(/) = VW(/) W(f) H W(f) = (I - (v(T, /) u(T, f) H ) T ) W(/), (9) 
where v(T, /) is a non-linear function of the source estimates u(T, /): 

v(T,/) = [ Vl (T, /),..., ^(T,/)] T , (10) 
Wi (T, /) = sign( Ml (T, /)) ^fcg^ , (11) 
. / s fo if 2 = 0, 

sign(z) = ' (12) 

I z/|z| if z 7^ 0. 

Here, I denotes the identity matrix, </(■) is the first derivative of function g(-), and H denotes 
complex conjugation and transpositio n. The gradient Eq. ll9l) was pre viously used in an algorithm 



for blind separation of speech signals I Anemiiller and Kollmeieil 12003). 
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Figure 3: The distribution Pui(|s|) of super-Gaussian source magnitude (solid) versus the distribu- 
tion of the magnitude of a two-dimensional Gaussian process with the same variance (dashed). The 
latter is the well-known Rayleigh distribution. The super-Gaussian source distribution is character- 
ized by its stronger peak at small magnitudes and its longer (high-magnitude) tails. 



For the choice 

g'(x) \-e~ x 
g[x) 1 + e x 

we obtain a complex g eneralization of the standard logistic infomax ICA learning rule 
l|Bell and SeinowskiLll995() . The algorithm may be adapted to different (e.g., sub-Gaussian) source 
distributions by use of other appropriate non-linearities g' j 1 g. In the case of purely real-valued data, 
the learning rule for complex data reduces to the infomax ICA learning rule for real signals. 

Due to the circular symmetry of V s , the log-likelihood L(W(f)) is invariant with respect to 
the multiplication of any row Wj(/) of W(/) with an arbitrary unit-norm complex number Ci(/), 
\ci(f)\ = 1. This parallels the sign-ambiguity of real ICA algorithms using symmetric non-linearities. 
However, since the circular symmetry allows for continuous invariance transformations (in contrast 
to the discrete sign- flip operation), detection of convergence is hindered. Therefore, we constrain the 
diagonal of VW(/) by projecting it to the real line, thereby reducing the invariance to a sign-flip 
ambiguity. 

The independent component decomposition based on Eq. JSJ) is performed separately for each 
frequency band /, yielding in total N(K + 1) complex independent component activation time- 
courses Uj(T, /) and N(K + 1) complex scalp maps 8Lj(f), where &j(f) denotes the j-th column of 
the estimated mixing matrix A(/) = W _1 (/). 



2.3.1 Complex ICA constrained to real scalp maps 

The complex scalp maps SLj(f) can be interpreted in terms of amplitude- and phase-differences 
between different spatial positions on the scalp produced by the spatio-temporal dynamics of the 
underlying EEG generators. However, it might also be of value to constrain the scalp maps to be 
real-valued as in standard ICA. In this constrained model of the EEG, sources are assumed to be 
frequency-specific (in contrast to the wide-band source model of standard ICA), but may not elicit 
the spatio-temporal dynamics of the fully-complex model. Together with a simpler interpretation, 
this approach has the advantage of making it possible to further separate the effects induced by wide- 
band versus band-limited data and by instantaneous (real) versus convolutive (complex) mixing. 1 

1 Mathematically, signal superposition by means of different real- valued mixing matrices in distinct frequency bands 
can also be interpreted as convolutive mixing of wide-band sources, but with symmetric filters. However, this special 
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To constrain the algorithm's solution to real scalp maps, the initial estimate of W(/) is chosen 
to be real (typically the identity matrix), and the gradient Eq. (JSJ is projected to the real plane, 
resulting in the constrained gradient 

v R w(/) = n (vw(/)) , (w) 

with 5ft denoting the real part. While the corresponding scalp maps a.j(f) are real, the separated IC 
activations u(T, /) remain complex. 

Eq. (|14|) differs from, e.g., applying standard infomax ICA to the real-parts of u(T, /) in that 
its underlying source model Eq. JSJ is still based on a distribution over the complex plane. As a 
result, the product v(T, /) u(T, f) H in the right hand side of Eq. ijTH is evaluated using complex 
multiplication. In principle, performing complex ICA to derive real-valued component maps might 
be more accurate than performing real ICA on c oncatenated real and im aginary parts of band- limited 
time-frequency transformations as proposed bv lZibulevskv et alJ <|2002f) since the circular symmetric 
complex distribution assumed by complex ICA should be more accurate than the assumption of 
mutual independence between real and imaginary parts used in the real spectral ICA decomposition 
method. 



2.3.2 Visualizing complex IC activations and maps 

Complex independent component activations Uj(T, /) may be conveniently visualized by separately 
plotting their power (squared amplitude) and phase. To simplify the visual impression of the phase 
data, we compensa te for the e f fect o f phase- advances locked to the carrier frequency by complex 
demodulation fe.g.. iBloomfieldl l2000|) . multiplying the IC activations with exp(— i2irfT/2K). This 
yields complex signals in the frequency band centered at Hz, the phase angles of which are plotted . 

For multi-trial data, th is results in two event-related potential (ERP) image plots l|.Iung et all 
Il99flt iMakeig et all Il999af) showing event-related power and phase at each frequency /. For visual 
presentation, the trials are grayscale coded after sorting in order of ascending response time, followed 
by smoothing with a 30-trial wide rectangular window. Response time in each trial is plotted 
super imposed on the data . The time-courses of mean event-related power and intertrial coherence 
(ITC, M akeig et al.l (|2002r )^ may then be computed from the multi-trial data by averaging data from 
identical event-related time-windows across trials. 

To visualize the complex component maps, the invariance of the source model (jSJ) with respect 
to rotation around the origin has to be taken into account. Therefore, for each complex map 
a.j(f) = [aij(f), . . . , OMj{f)] T any rotated version Cj(f) SLj(f) thereof constitutes an equivalent map, 
with Cj(f) an arbitrary unit-norm complex number. For visualization we plot real-part, imaginary- 
part, magnitude and phase values of the equivalent map B.j(f) — Cj(f) &j(f) for which the sum of 
the imaginary parts 3 vanishes and the sum of the real parts 5ft is positive, i.e., 



=3 £««(/)) =0 and £>(a«(/))>0 

i \ i / i 



(15) 



c '- (/) = fi^OT (16) 

A complex map B.j(f) whose elements ay(/) have negligible (near zero) imaginary part for all 
i = 1 , . . . , M indicates that the corresponding EEG process may represent activity of a highly 
synchronized generator ensemble, without phase shifts across the spatial extent of the source. A 

case of convolutive mixing may be too restricted to fully model the possible complexity of the underlying neuronal 
dynamics. Therefore, we adopt the more plausible interpretation that real-valued mixing in different frequency bands 
reflects band-limited processes without spatio-temporal dynamics. 
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non-negligible imaginary part is equivalent to phase-differences between distinct scalp electrode 
positions which may be elicited by spatio-temporal dynamics of the corresponding EEG process, 
e.g., spatial propagation of EEG activity. 



2.3.3 Degree of separation 

To quantify the degree of separation attained, we compute second and fourth order measures of 
statistical dependency. 

Second order correlations are taken into account by computing, for each frequency /, the mean 
p{f) of the absolute values of correlation-coefficients Pij(f) for all different component pairs i =/= j: 



p(f) 



N(N 



where the correlation-coefficients arc defined as 

(u i (T,f)u* j (t,f)) T - fH (f)fi*(f) 



Pij(f) 



*<(/) 



(17) 

(18) 
(19) 
(20) 



Pij(f) vanishes for uncorrelated signals and acquires its maximum (one) only when signals Ui{T, f) 
and Uj(T, f) are proportional. Since the measured signals x(T, /) are complex (except at Hz and 
the Nyquist frequency) , complete decorrelation may in general only be achieved by the fully-complex 
ICA algorithm (Eq.^J, whereas the real-map constrained-complex ICA algorithm fEq. H4fl and time- 
domain ICA will generally exhibit non-zero values of p(f). 

Second order decorrelation is not a sufficient condition for statistical independence. Therefore, 
we (partially) evaluate higher order statistical dependencies by computing the analog quantity p'{f) 
of the time-courses of squared amplitudes \ui(T, f)\ 2 : 



<*(/) = \l (\ui{T, f) — Pi{f)\' 



P'(f) 



N(N — 1) 



where 



(K(t,/)| 2 k(t,/)| 2 ) t -^(/) m ;.(/) 



p'aif) 

/4(/) = <k(T,/)| 2 ) T , 

<{f) = A /((K(T,/)P-^(/)) : 



(21) 

(22) 
(23) 
(24) 



Eq. (1221) measures statistical dependency of fourth or der. It can be interpreted a s a modified 
and normalized variant of a fourth order cross-cumulant ijNikias and Petropuhl Il998h . Its value is 
zero for independent signals, non-zero for signals exhibiting correlated fluctuations in signal power, 
and maximum (one) only for signals with proportional squared-amplitude time-courses (regardless 
of phase). 
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2.4 Corresponding components in distinct spectral bands 

The complex spectral-domain ICA algorithm described above produces separate sets of independent 
components for distinct and comparably narrow spectral bands. Activity in some underlying EEG 
source domains might exhibit strictly narrow-band characteristics. However, generator activity may 
also take place in a broader spectral range comprising contiguous or disconnected spectral bands. 
Narrow-band ICA analysis does not take into account such links between bands, but separates the 
data into independent components ordered arbitrarily (e.g., by band-limited power) in each band. 
Therefore, components that may account for activity within a single underlying EEG generator 
may be captured by components in multiple bands (with possibly distinct component numbers) . To 
obtain a full picture of the underlying EEG processes, it is desirable to identify and group together 
those components in different bands that likely represent activity of the same physiological source. 

In this section, we present methods for identifying and clustering groups of similar components 
across frequencies. The methods are based on appropriate measures of distance between pairs 
of component maps or component activations, respectively. Matching component pairs are then 
identified using a standard optimal-assignment procedure. 



2.4.1 Distance between component maps 

Our definition of the distance between component maps is based on the euclidian distance |aj(/i) — 
a i(/2)| of the complex vectors SLi(fi) and a.,(/ 2 ) representing two maps. Since euclidian distance 
is not invariant with respect to arbitrary rescaling of the maps, it should be normalized. The 
multiplication of one map with an arbitrary unit-norm complex number c, |c| — 1, also alters the 
euclidian distance, although it results in an equivalent map. Therefore, we define the map distance 
rf map (i, /i, j, / 2 ) of maps &i(fi) and aj(/ 2 ) as the rescaled minimal euclidian distance between the 
normalized maps, 

d map (i,fi,j,f2) = ^ m c in 
which is written equivalently in terms of their innner-product as 

jft(caf(A)a 3 (/ 2 )) \ 2 _ [ / |af(A)a 3 (/ 2 )| \ 2 
k(/i)||ai(/ 3 )| ) V \\*i(fi)\\*i(h)\) ' 

The map distance measure attains its maximum (one) for orthogonal maps and its minimum (zero) 
only for equivalent maps. 



a i(/l) a j(/2) 



MA)| Mh)\ 



M = i, 



(25) 



C \ 



2.4.2 Distance between component activations 

We define the distance between complex component activations based on the correlation of signal- 
power time-courses at different frequencies. 2 Between IC activations Ui(T,fi) and Uj(T, / 2 ) at 
frequencies f\ and / 2 , respectively, the component activation distance d ac t(«, ft,j, /b) may be defined 



d«*(*\/i,j,/2) = l-p«(/i,/a), (27) 
where (analogous to Ea. l22|) . pL (/i, fz) denotes the correlation-coefficient of the squared-amplitude 
time-courses |uj(T, /i)| 2 and \uj(T, / 2 )| 2 , 



p'iAh,h) 



(h(T,/ 1 )| 2 K-(T,/ 2 )| 2 > T - M <(/i)MK/2 



^(/iK(/2) 



(28) 



2 Second order correlation drops off sharply with spectral difference because of the orthogonality of the Fourier 
basis and therefore is not appropriate for computing distances across different spectral bands. 
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with ^-(/) and er-(/) denned according to Eqs. ( and lj2"4l) . respectively. By this measure, 
independent signals have maximal distance (one) , whereas signals with highly correlated fluctuations 
in signal power have distance near minimum (zero). Related changes in signal power in different 
frequency bands may be exhibited by EEG generators with activity in both bands, since modulation 
of generator activity — induced, e.g., by experimental events or common modulatory processes — may 
result in synchronous amplitude changes (in the same or different direction) in the participating 
bands. 

2.4.3 Assigning best-matching component pairs 

Based on the distance measures described in sections 12.4.11 and 12.4.21 we define the set of pairs of 
best- matching components to be that which minimizes the average distance between the pairs. 

Consider a given pair of frequencies (/i, $2) and a chosen distance measure d(i, fi,j, $2) (either 
map distance d map or activation distance G? ac t)- Assigning best-matching component pairs is equiva- 
lent to finding the permutation 7r(i), i = 1, . . . ,N, that assigns component i at frequency /1 to com- 
ponent j = n(i) at frequency f% such that the mean distance across all pairs, ^ d{i, fi,ir(i), /2V-/V, 
is minimized: 



Determining given the matrix of distances d(i, fi,j, /2) between all pairs (i, j) is known as the 
'assig nment problem'. A classic algorithm for solvin g this problem is the Hungarian method l|Kuhnl 
Il955l) which we use here following the suggestion of lEnghofll 1 199fl() . 

The minimal mean distance -D(/i,/2) is a global measure of the distance between the sets of 
components at frequencies /1 and f2- For equal frequencies, /1 = f2, -CK/1,/2) always attains its 
minimum (zero), and the permutation becomes the identity, ir(i) = i. If the components at frequency 
/1 are identical to the components at frequency f2, but occur in a different order, then -D(/i, /2) is 
also zero and corresponds to the permutation of order. If some components are identical at both 
frequencies, whereas the remaining components exhibit maximum distance to all other components, 
then D(fi, $2) corresponds to the fraction of non-identical components. For the realistic case of few 
components being reproduced exactly across frequencies and many components matching similar 
but not identical components at other frequencies, D(fi,f 2 ) attains values between zero and one, 
indicating the degree of average similarity of the best-matching component pairs. 

2.5 Time-domain ICA 

We analyze separation results from time-domain ICA using similar methods as those presented in 
sections 2.3.3 and 2.4 for the analysis of frequency-domain ICA. Time-domain infomax ICA is applied 
to the time-domain signals Xi(t), resulting in a single separating matrix W. The corresponding 
components maps are given by the columns &j of the mixing matrix A = W _1 . We obtain frequency- 
specific unmixed signals by applying W to the spectral transforms of the sources, yielding complex 
separated signals u(T, /) = Wx(T, /), from which we compute the measures for the quality of 
separation (cf. section 12. 3. 3|) . Distances between time-domain and spectral-domain components are 
obtained based on the methods presented in section f2.4l The distance d map (i,j, /) between time- 
domain ICA maps a^ and spectral-domain ICA maps a^(/) is computed analogously to Eq. (|25|l . 
Similarly, IC activations obtained with time-domain and frequency-domain ICA are compared by 
computing the distance d ac t(i, j, /) in analogy to Eq. 1)27(1 . We then assign best-matching component 
pairs using the method presented in section \2. 4. 31 




(29) 




(30) 
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Figure 4: Histograms for estimated kurtosis of complex spectral-domain electrode signals (thin 
line) and independent component activations (thick line). Each histogram based on 3131 kurtosis 
estimates (see text), 44 bins of width 0.05 in the interval from to 3. 



3 Results 

In this section, we present results from the analysis of a visual spatial selective attention experiment 
where the subject attended one out of five indicated locations on a screen while fixating a central 
cross, and was asked to respond by a button press as quickly as pos sible each time a targe t stimulus 
appeared in the attended location. For details of the experiment, see lMakeig et alJ 1 199 9b'). Included 
in the analysis were 582 trials from target stimulus epochs collected from one subject. Each epoch 
was 1 s long, beginning at 100 ms before stimulus onset at t = ms. 

The data were recorded from 31 EEG electrodes (each refered to the right mastoid) at a sampling 
rate of 256 Hz and decomposed into 101 equidistantly spaced spectral bands with center frequencies 
from 0.0 Hz (DC) to 50.0 Hz in 0.5-Hz steps. Decomposition was performed by short-time discrete 
Fourier transformation with a hanning window of length 50 samples, corresponding to a spectral 
resolution of 5.12 Hz (defined as half width at half maximum), and a window shift of 1 sample 
between successive analysis windows. This yielded 207 short-time spectra for each trial derived from 
analysis windows centered at times between 1.6 ms and 806.3 ms following stimulus presentations 
in 3.9 ms steps. 

To decompose the data into independent components, the 582 trials were concatenated, resulting 
for each spectral band, / = 0, . . . , 101, and channel, i = 1, . . . , 31, in frames T = 1, . . . , 207 x 582 = 
120474. No pre-training sphering of the data was performed. The separating matrix W(/) was 
initialized with the identity matrix for all spectral bands. We used the logistic non-linearity fEa. ll3f) . 
computed the gradients (Eq. |5Jl and (Eq. I14fl . respectively, at each iteration step from 10 randomly 
chosen data points, and lowered the learning rate of the gradient ascent procedure successively. 
Optimization of W(/) was halted when the total weight-change induced by one sweep through the 
whole data was smaller than 10~ 6 relative to the Frobenius norm of the weight-matrix. 

The dataset was decomposed using both the fully-complex (Eq. |5J) and the real- map constrained- 
complex (Eq. I14f> algorithms. For comparison, the same dataset was also decomposed using time- 
domain infomax ICA applied to the time-domain data Xi(t); the obtained single real separating 
matrix was then applied to the spectral-domain data x(T, /) as described in section l2~5l 
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Figure 5: Residual statistical dependencies evaluated using second order (left panel) and fourth order 
(right panel) measures at frequency bands between Hz and 50 Hz. Residuals for the recorded 
electrode signals (dotted), signal separation obtained from real time-domain infomax ICA (dash- 
dotted), real-map constrained-complex spectral-domain ICA (dashed), and fully-complex spectral- 
domain ICA (solid). 



3.1 Kurtosis 

To test the assumption of super-Gaussian source distributions, we used kurtosis to assess deviations 
from a Gaussian distribution. Kurtosis estimates were computed for spectral-domain data as 



kurt(z) = (|z| 4 )-2((|z| 2 )) 2 -|(zz) 



(31) 



assuming a zero- mean, unit-variance complex random variable z ijHvvarinen et all l200lj) . The 
Kurtosis kurt(z) vanishes for a Gaussian distribution and attains positive and negative values for 
super- and sub-Gaussian distributions, respectively. 

Kurtosis of the spectral-domain electrode signals Xi(T, f) was computed individually for each 
channel i at every frequency /, yielding 31 x 101 = 3131 kurtosis estimates, each based on all 120474 
complex data frames. All of the 3131 channel-frequency kurtosis estimates showed a super-Gaussian 
distribution with minimum 0.02, maximum 23.45 and median 0.43. A histogram of the kurtosis 
values is displayed in Fig. 0] (thin line). 

Analogously, we computed the same number of kurtosis estimates for the IC activations Uj(T, f) 
obtained with the fully-complex ICA algorithm. The median kurtosis increased to 0.55 and only 
super-Gaussian distributions in the range [0.10,386.79] were found. Their histogram is shown in 
Fig- H (thick line). 

These results support our choice of source model, and indicate that only a small advantage might 
be expected by allowing the source distributions to include sub-Gaussian sources. Therefore, we did 
not consider the possibility of sub-Gaussian sources further. 



3.2 Degree of Separation 

To assess the degree of separation achieved by the different ICA algorithms, we computed residual 
statistical dependencies using the second order (Ea. rT7|l and fourth order (Eg. 121)1 statistics described 
in section 12.3.31 Results are displayed in Fig. for the recorded electrode signals and for the sep- 
arations into sources obtained from real time-domain infomax ICA, real-map constrained-complex 
and fully-complex spectral-domain ICA. For both measures and all frequencies, fully-complex ICA 
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Figure 6: Mean distance between the component maps obtained by time-domain infomax ICA and 
best-matching frequency-specific component maps of real-map constrained-complex ICA. Abscissa: 
frequency of spectral-domain component. Ordinate: mean distance to time-domain ICA map. 

achieved the lowest levels of residual dependencies. Real-map constrained results exhibited compa- 
rably higher residuals, and time-domain infomax ICA still higher levels. 

The residual second order correlations exhibited by fully-complex ICA were — with the exception 
of very low frequencies — about one order of magnitude lower than those attained by time-domain 
ICA, and below half of those achieved by real-map constrained-complex ICA. This result may largely 
be explained by the higher number of degrees of freedom of the complex ICA algorithms that model 
the superposition within each frequency band with a different mixing matrix, whereas time-domain 
ICA uses a single matrix for all frequencies. Fully-complex ICA achieved the lowest levels of residual 
correlation since it is the only algorithm that models superposition using a different complex matrix 
for every frequency, which in general is necessary to decorrelate complex input signals. In the 
0-Hz frequency band, the frequency-domain electrode signals are real, which explains the similar 
performance of the real-map constrained- and fully-complex algorithms at the lower end of the 
spectral range. 

The residual fourth order correlations showed a smaller difference between the real-map con- 
strained and fully-complex ICA algorithms, the latter exhibiting slightly lower residual dependen- 
cies for all but very low frequencies. Remarkably, there was almost no difference in fourth order 
correlations between the three algorithms in the range from Hz to approximately 6 Hz, which may 
be due to high power of the signals in this range. Therefore, time-domain ICA may be best capable 
of separating signals in this spectral region. Between about 6 Hz and 50 Hz, the residual fourth 
order correlations of time-domain ICA showed large fluctuations — near 27 Hz and 50 Hz component 
independence was close to that of the recorded signals. 

These findings indicate that additional degrees of freedom of the spectral-domain convolutivc 
mixing model (compared to the instantaneous mixing model) enable it to produce components with 
a higher degree of signal separation. If the underlying EEG processes had wide-band characteristics 
and no spatio-temporal dynamics, it would have been expected that all three algorithms performed 
equally well. Therefore, the assumption of a fully-complex frequency-specific mixing model appears 
to be supported by the resulting lower residual dependencies. 

3.3 Distance between component maps 

We further compared time-domain ICA and real-map constrained-complex ICA by computing, for 
every frequency / = !,..., 101, the distance d map (i,j, f) between the i-th component map of time- 
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Figure 7: Minimal mean distances -D ac t(/i,/2) computed from component activation functions ob- 
tained with the fully-complex ICA algorithm in 101 frequency bands of width 5.12 Hz, spaced 
equidistantly between Hz and 50 Hz in 0.5-Hz increments. Right: Distances for all best-matching 
component pairs of different frequencies. Left: Enlarged view of the 0-Hz to 20-Hz range. 

domain ICA and the j-th component map of complex ICA at frequency /, see section [2.51 Best- 
matching component maps were assigned for each / using the assignment method described in section 
12.4.31 yielding a minimal mean distance D(f) (analogous to Ea. I30[l. which is shown in Fig.EI 

Across all frequencies, the distance between component maps obtained by time-domain ICA 
and by constrained-complex spectral-domain ICA is at least 0.4. Largest distances are exhibited at 
frequencies of 30 Hz or higher, while the maps show closest resemblance around a minimum in the 
5- Hz to 10-Hz range. In conjunction with the results from section T3. 21 this may serve as a further 
indication that separation of EEG data by time-domain ICA may be dominated by low-frequency 
activity. 

3.4 Distance between component activations 

Distances between component activation time-courses Ui(T, /i) and Uj(T, fy) were computed for the 
fully-complex ICA separation according to Eq. (|27|) for all possible combinations of (£, fi,j, j^)- Best- 
matching components were assigned for each pair of frequencies (/i, $2) using the method presented 
in section [2.4.31 yielding one minimal mean distance -D a ct(/i>/2) for every frequency pair. The 
distances between all frequency pairs are visualized in Fig. (right panel). Note that the level of 
detail available in the visualized spectral features is in principle limited by the spectral resolution 
(5.12 Hz) of the windowing function employed in the time-frequency transformation. 

The distance matrix is dominated by values on the diagonal as expected from the bandwidth of 
the spectral decomposition. For larger spectral distances (away from the diagonal) several nearly 
rectangular distance patterns emerge, that deviate from the diagonal structure as expected in the 
absence of spectral clusters. Qualitatively, we may identify three spectral blocks, extending roughly 
from Hz to 8 Hz (corresponding to delta and theta bands), from 8 Hz to 30 Hz (alpha and beta 
bands), and from 30 Hz to at least 50 Hz (gamma band). 

Although the exact borders and shapes of the spectral clusters cannot be identified in the figure, 
the existence of spectral structure may reflect physiological processes that extend over some spectral 
range and thereby induce independent components with similarities across frequencies. Thus, the 
complex spectral-domain ICA method may serve as the starting point for extracting components 
with physiological relevance from EEG data. Whereas our present analysis of the clusters is based on 
the qualitative interpretation of the average component distances across frequencies, further analysis 
may employ quantitative clustering methods on individual (unaveraged) component distances and 
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should thereby produce a more detailed picture of component similarities across frequencies. 
3.5 Examples of maps and activations 

A large number of independent component maps and activations were obtained for different frequency 
bands. Relevance of the components may be assessed based on the compatibility of component maps 
and activations with known EEG physiology, experimental design and subject behavior. We show 
here one set of components whos e central-midline p r o jecti ons are similar to EEG activity associated 
with orienting to novel stimuli l|Courchesne et all Il975|) . The response of these components to 
stimulus presentation is most marked in the 5-Hz band and shows a clear relation to subject behavior. 

Figs. ISHTUI illustrate differences between the real infomax, real-map constrained-complex infomax 
and fully-complex infomax ICs. The real infomax IC (Fig. |8J shows a clear increase in power near 
the median response time at about 300 ms, and a strong mean phase resetting which is visible near 
300 ms as a phase-wrap (from — 7r to n) and as a peak in the ITC. 

The corresponding component obtained from real-map constrained-complex ICA at 5 Hz is dis- 
played in Fig- EI Its component map shares the spatial focus of maximum scalp projection with the 
time-domain IC map (cf. Fig. EJ, but the spatial extent of the projection appears different. Com- 
paring the complex activation time-courses, the real-map constrained-complex IC shows a stronger 
response-locked power increase near 300 ms which is also more closely linked to the response time 
(Fig. El center panel) , and shows a more consistent phase-resetting and higher ITC near 300 ms af- 
ter stimulus presentation (Fig. El right panel). This indicates that spectral-domain ICA may reflect 
subject behavior and underlying brain processes more faithfully than time-domain ICA. 

The real part map obtained by decomposing the 5- Hz band with the fully-complex ICA algorithm 
fFig. 1101 left) appears similar to the real-constrained component map. The corresponding imaginary 
part map fFig. 1101 second from left) has a non-negligible amplitude at the spatial focus of maximum 
scalp projection. This indicates the presence of spatio-temporal dynamics in the data, and that these 
dynamics are modeled better with complex maps than with static real maps. Here, the complex 
IC magnitude and phase activations (Fig. 1101 right) do not appear qualitatively different from the 
activations obtained with the real-map constrained-complex algorithm (Fig. El right), although (as 
we have shown in section the fully-complex ICA results in IC activations with a higher degree 
of independence than those obtained with real-map constrained-complex ICA. 

To illustrate the similarity of component maps over different spectral bands, Fig. llll displavs those 
maps from the 10-Hz to 30-Hz decompositions that best match the illustrated 5-Hz component. The 
maps in Fig. II II were obtained using the fully-complex ICA algorithm; only the magnitude maps are 
shown. While the site of maximum scalp projection remains similar, the maps exhibit differences 
in shape and spatial extent, further suggesting that the complex spectral-domain ICA algorithm 
models aspects of the data that real ICA algorithms ignore. 

Components not shown here are on the whole characterized by a single focus of activation in 
the associated scalp maps, which may indicate that their generators are located in spatially contin- 
uous (as opposed to disconnected) cortical regions. About half of the components display a clearly 
non-zero imaginary part in their scalp maps, corresponding to processes with spatio-temporal dy- 
namics. However, we also find component maps with imaginary parts that do not appear to deviate 
significantly from zero. Under the proposed model, these components correspond to static sources 
with negligible spatio-temporal dynamics. This demonstrates that the complex ICA algorithm does 
not necessarily produce complex component maps, but also extracts the special case of real-valued 
mixing systems when supported by the data. 
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Figure 8: Independent component at 5 Hz obtained from standard time-domain infomax ICA. Left: 
Scalp map. Middle: ERP-image of 5-Hz power. Right: ERP-image of complex-demodulated 5-Hz 
phase. Response times superimposed on data. Lower panels: Mean time-courses of event-related 
5-Hz power (middle) and 5-Hz intertrial coherence (ITC, right). 



Power Phase 




Figure 9: Independent component at 5 Hz obtained from real- map constrained-complex spectral- 
domain ICA. Same dataset as Fig. El Left: Scalp map. Middle: ERP-image of 5-Hz power. Right: 
ERP-image of complex-demodulated 5-Hz phase. Response time and lower panels analogous to Fig. 

El 



Power Phase 




Figure 10: Independent component at 5 Hz obtained from fully-complex spectral-domain ICA. Same 
dataset as Figs. 03 and El From left to right: Real and imaginary part of the complex scalp map, 
respectively; ERP-images of 5-Hz power and complex-demodulated 5-Hz phase of the complex IC 
activation time-courses, respectively. Response time and lower panels analogous to Fig. El 
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Figure 11: Magnitude maps of complex independent components obtained using the fully-complex 
spectral-domain ICA algorithm at five frequency bands, same dataset as Figs. 181101 



4 Discussion and conclusion 

We have presented a new method for the analysis of dynamic brain data and in particular electroen- 
cephalographic signals. The method is based on spectral decomposition of the sensor signals, and 
subsequent analysis within distinct spectral bands by means of a complex infomax algorithm for 
independent component analysis. 

Although the applicability of ICA to time-domain EEG data is well established, the results 
obtained from the EEG dataset presented here — together with results from EEG data not shown — 
strongly support the applicability of complex spectral-domain ICA to EEG modeling and analysis. 

Two different aspects of the method appear to offer improvements over previous ICA algorithms 
for modeling dynamic brain data. First, signal superposition is modeled as a convolution, permitting 
sources to exhibit spatio-temporal dynamics. Evidence for spatio-temporally dynamic patterns has 
been found in invas ive recordings in animal cortex and includes spatial propagation of neural activit y 
(jArieli et all ll9961. traveling waves l|FreemarAll975ULopez da Silva and Storm van Leeuwenlll978l) . 



-if- 

and phase shifted activity between different regions IjStein et alll200fy . Second, signal superposition 
may be frequency dependent, allowing for distinct signal sources at different frequencies. This view 
follows naturally from the conventional notion of different frequency bands in EEG that appear to 
be related to different physiological functions. 

The convolutive mixing assumption gained support from the decomposition results. The fully- 
complex spectral-domain ICA algorithm exhibited the lowest residual statistical dependencies, and 
many complex component maps showed clear non-negligible imaginary parts, indicating that complex 
ICA modeled spatio-temporal source dynamics in the data. These first steps in understanding 
the relation between complex independent components and underlying brain processes may be a 
qualitative step forward in modeling EEG data with ICA, a step that could potentially result in new 
insights into brain dynamics. 

The assumption of frequency-dependent signal mixing was supported in three ways. First, resid- 
ual statistical dependencies after separation were lower with complex frequency-dependent ICA than 
with real wide-band ICA. Second, component maps obtained with complex ICA varied across fre- 
quencies. Third, results of complex ICA included distinct spectral ranges exhibiting clusters of sim- 
ilar independent components, which might pertain to physiological processes with activity over the 
corresponding spectral bands. Compared to previous methods, our results indicate that improve- 
ments in analysis may be expected in the spectral range above 8 Hz, with largest improvements 
possible above 20 Hz, where the deviations between time-domain ICA and complex spectral-domain 
ICA results appear to be strongest. However, the example data presented also indicated an advan- 
tage for complex ICA in the 5-Hz (theta) band, where complex ICA produced a component whose 
activity was more reliably related to subject task behavior than the corresponding real time-domain 
ICA component. 

We have presented methods for assigning best-matching complex component pairs in different 
spectral bands to common sources. For these data, complex ICA produced physiologically plausible 
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component clusters. However, further methodological improvements could be explored including 
other measures of component similarity, assignment procedures and quantitative clustering methods. 

Other recording techniques, like the magnetoencephalogram (MEG) or functional magnetic res- 
onance imaging (fMRI), and other electrical recordings from the human body such as electromyo- 
graphic (EMG) and electrocardiographic (ECG) recordings, might also benefit from the presented 
methods. Should statistical and physiological analysis of those data indicate the applicability of 
complex spectral-domain ICA, new directions for research might be open for those fields. 

Several open questions regarding aspects of the presented methods should be investigated in 
further studies. 

• The present work focused on the frequency-domain related aspects of the algorithm. For a 
better understanding of the obtained components, it will be necessary to project them to 
the corresponding time-domain electrode voltages and study the resulting time- varying spatial 
distributions with respect to experimental design. It should be possible to validate the proposed 
method by performing experiments for which a-priori knowledge exists as to expected spatio- 
temporal dynamics of the scalp maps. 

• One benefit of the frequency-dependent mixing assumption is that it may enable identification 
of a higher number of stable independent components. However, each component obtained 
by the fully-complex algorithm can model only a single mode of spatio-temporal dynamics, 
corresponding to, e.g., a single direction of a cortical activation trajectory. Sources with 
(nearly) identical foci of activation but different spatio-temporal dynamics may therefore be 
decomposed by complex ICA into distinct components. Due to this higher sensitivity, the 
fully-complex algorithm could benefit from a larger number of EEG sensors. 

• The stability of components both within and across subjects is a related area for further studies. 
Variability of components with respect to, e.g., number of electrodes, recording- length, subject 
variability and variability across recording sessions should be investigated. 

• The spectral basis employed in the present algorithms may appear as a natural choice, and 
it has the advantage of allowing for the simultaneous analysis of frequency-dependent and 
convolutive mixing using a single mathematical model. However, other model choices may be 
possible and better adapted to the data than the present spectral basis. 

The experimental results presented here indicate that complex spectral-domain independent com- 
ponents model aspects of spatio-temporal dynamics in the data that real-valued independent com- 
ponents ignore. To support this possibility, we have shown one example showing spatio-temporal 
dynamics and a tighter relation of complex components to subject behavior. To confirm the rele- 
vance of the new method for understanding brain data, it is important to further investigate the 
physiological plausibility of the decompositions and their functional relation to behavior based on 
more extensive analysis across subjects and experiments. 
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